Personalized Pancreatic Tumor Growth Prediction via Group Learning
Tumor growth prediction, a highly challenging task, has long been viewed as a
mathematical modeling problem, where the tumor growth pattern is personalized
based on imaging and clinical data of a target patient. Though mathematical
models yield promising results, their prediction accuracy may be limited by the
absence of population trend data and personalized clinical characteristics. In
this paper, we propose a statistical group learning approach to predict the
tumor growth pattern that incorporates both the population trend and
personalized data, in order to discover high-level features from multimodal
imaging data. A deep convolutional neural network approach is developed to
model the voxel-wise spatio-temporal tumor progression. The deep features are
combined with the time intervals and the clinical factors to feed a process of
feature selection. Our predictive model is pretrained on a group data set and
personalized on the target patient data to estimate the future spatio-temporal
progression of the patient's tumor. Multimodal imaging data at multiple time
points are used in the learning, personalization and inference stages. Our
method achieves a Dice similarity coefficient (DSC) of 86.8% ± 3.6% and a
relative volume difference (RVD) of 7.9% ± 5.4% on a pancreatic tumor data set,
outperforming the DSC of 84.4% ± 4.0% and RVD of 13.9% ± 9.8% obtained by a
previous state-of-the-art model-based method.
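The two metrics reported above can be illustrated with a minimal sketch. It assumes binary voxel masks and the common definitions DSC = 2|A ∩ B| / (|A| + |B|) and RVD = |V_pred − V_true| / V_true; the function names and the toy masks are illustrative and are not taken from the paper.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = int(pred.sum()) + int(truth.sum())
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    overlap = int(np.logical_and(pred, truth).sum())
    return 2.0 * overlap / total


def relative_volume_difference(pred, truth):
    """Absolute volume difference relative to the ground-truth volume."""
    v_pred = int(np.asarray(pred, dtype=bool).sum())
    v_truth = int(np.asarray(truth, dtype=bool).sum())
    if v_truth == 0:
        raise ValueError("ground-truth mask is empty")
    return abs(v_pred - v_truth) / v_truth


# Toy 3D example: an 8-voxel ground truth fully contained in a 12-voxel prediction.
truth = np.zeros((4, 4, 4), dtype=bool)
truth[1:3, 1:3, 1:3] = True
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:4] = True

dsc = dice_coefficient(pred, truth)            # 2 * 8 / (12 + 8)
rvd = relative_volume_difference(pred, truth)  # |12 - 8| / 8
print(f"DSC = {dsc:.3f}, RVD = {rvd:.3f}")
```

Higher DSC and lower RVD both indicate a prediction closer to the observed tumor volume.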
Heuristic Search over a Ranking for Feature Selection
In this work, we propose a new feature selection technique that lets us use the wrapper approach to find a well-suited feature set for distinguishing experiment classes in high-dimensional data sets. Our method is based on the relevance and redundancy idea, in the sense that a ranked feature is chosen only if adding it provides additional information. This heuristic leads to considerably better accuracy than the full feature set and other representative feature selection algorithms on twelve well-known data sets, coupled with notable dimensionality reduction.
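The general idea of a wrapper search over a feature ranking can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the ranking criterion (ratio of between-class to within-class variance), the nearest-centroid classifier, the single validation split, and all function names are choices made here for brevity.

```python
import numpy as np


def rank_features(X, y):
    """Order features by between-class / within-class variance, best first."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    return np.argsort(-(between / (within + 1e-12)))


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val):
    """Validation accuracy of a nearest-centroid classifier (the wrapped model)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_val).mean())


def ranked_wrapper_selection(X_train, y_train, X_val, y_val, min_gain=0.0):
    """Walk the ranking once; keep a feature only if accuracy strictly improves."""
    selected, best = [], 0.0
    for f in rank_features(X_train, y_train):
        candidate = selected + [int(f)]
        acc = nearest_centroid_accuracy(
            X_train[:, candidate], y_train, X_val[:, candidate], y_val
        )
        if acc > best + min_gain:
            selected, best = candidate, acc
    return selected, best


# Toy data: feature 0 is informative, feature 1 is an exact copy of it
# (relevant but redundant), and features 2-4 are pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
informative = 6.0 * y + rng.normal(size=200)
X = np.column_stack([informative, informative.copy(), rng.normal(size=(200, 3))])
order = rng.permutation(200)
train, val = order[:120], order[120:]

selected, best = ranked_wrapper_selection(X[train], y[train], X[val], y[val])
print("selected features:", selected, "validation accuracy:", best)
```

Because the duplicate column adds no information, the search keeps at most one of features 0 and 1, which is the relevance-plus-redundancy behaviour the abstract describes.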
Digging into acceptor splice site prediction: an iterative feature selection approach
Feature selection techniques are often used to reduce data dimensionality, increase classification performance, and gain insight into the processes that generated the data. In this paper, we describe an iterative procedure of feature selection and feature construction steps, improving the classification of acceptor splice sites, an important subtask of gene prediction.
We show that acceptor prediction can benefit from feature selection, and describe how feature selection techniques can be used to gain new insights in the classification of acceptor sites. This is illustrated by the identification of a new, biologically motivated feature: the AG-scanning feature.
The results described in this paper contribute both to the domain of gene prediction and to research in feature selection techniques, describing a new wrapper-based feature weighting method that aids in knowledge discovery when dealing with complex datasets.
A Methodology for the Diagnostic of Aircraft Engine Based on Indicators Aggregation
Aircraft engine manufacturers collect large amounts of engine-related data
during flights. These data are used to detect anomalies in the engines in order
to help companies optimize their maintenance costs. This article introduces and
studies a generic methodology that allows one to build automatic early signs of
anomaly detection in a way that is understandable by human operators who make
the final maintenance decision. The main idea of the method is to generate a
very large number of binary indicators based on parametric anomaly scores
designed by experts, complemented by simple aggregations of those scores. The
best indicators are selected via a classical forward scheme, leading to a much
reduced number of indicators that are tuned to a data set. We illustrate the
interest of the method on simulated data which contain realistic early signs of
anomalies. Comment: Proceedings of the 14th Industrial Conference, ICDM 2014, St.
Petersburg, Russian Federation (2014)
Is This a Joke? Detecting Humor in Spanish Tweets
While humor has been historically studied from a psychological, cognitive and
linguistic standpoint, its study from a computational perspective is an area
yet to be explored in Computational Linguistics. There exist some previous
works, but a characterization of humor that allows its automatic recognition
and generation is far from being specified. In this work we build a
crowdsourced corpus of labeled tweets, annotated according to their humor value,
letting the annotators subjectively decide which are humorous. A humor
classifier for Spanish tweets is assembled based on supervised learning,
reaching a precision of 84% and a recall of 69%. Comment: Preprint version, without referra
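For readers who want a single summary figure, the reported precision and recall can be combined into an F1 score (their harmonic mean). The short sketch below only performs that arithmetic; the F1 value is derived here and is not a figure quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the Spanish-tweet humor classifier.
f1 = f1_score(0.84, 0.69)
print(f"F1 = {f1:.3f}")  # F1 = 0.758
```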
Predicting sentence translation quality using extrinsic and language independent features
We develop a top performing model for automatic, accurate, and language independent prediction of sentence-level statistical machine translation (SMT) quality with or without looking at the translation outputs.
We derive various feature functions measuring the closeness of a given test sentence to the training data and
the difficulty of translating the sentence.
We describe "mono" feature functions that are based on statistics of only one side of the parallel
training corpora and "duo" feature functions that incorporate statistics involving both source and
target sides of the training data.
Overall, we describe novel, language independent, and SMT system extrinsic features for predicting the SMT performance, which also rank high during feature ranking evaluations.
We experiment with different learning settings, with or without looking at the translations, which help differentiate the contribution of different feature sets.
We apply partial least squares and feature subset selection, both of which improve the results, and we present a ranking of the top features selected for each learning setting, providing an exhaustive analysis of the extrinsic features used.
We show that by just looking at the test source sentences and not using the translation outputs at all, we can
achieve better performance than a baseline system using SMT model dependent features that generated the
translations.
Furthermore, our prediction system is able to achieve the 2nd best performance overall according to the official
results of the Quality Estimation Task (QET) challenge when also looking at the translation outputs.
Our representation and features achieve the top performance in QET among the models using the SVR learning model.
Orientational instabilities in nematics with weak anchoring under combined action of steady flow and external fields
We study the homogeneous and the spatially periodic instabilities in a
nematic liquid crystal layer subjected to steady plane Couette or
Poiseuille flow. The initial director orientation is perpendicular to the flow
plane. Weak anchoring at the confining plates and the influence of an external
electric and/or magnetic field are taken into account. Approximate
expressions for the critical shear rate are presented and compared with
semi-analytical solutions in the case of Couette flow and with numerical
solutions of the full set of nematodynamic equations for Poiseuille flow. In
particular, the dependence of the type of instability and the threshold on the
azimuthal and polar anchoring strengths and on the external fields is analysed.
Comment: 12 pages, 6 figures
A transfer-learning approach to feature extraction from cancer transcriptomes with deep autoencoders
Published in Lecture Notes in Computer Science. The diagnosis and prognosis of
cancer are among the most challenging tasks in oncology. With the main aim of
selecting the most appropriate treatments, current personalized medicine
focuses on using data from heterogeneous sources to estimate the evolution of
a given disease for the particular case of a certain patient. In recent years,
next-generation sequencing data have boosted cancer prediction by supplying
gene-expression information that has allowed diverse machine learning
algorithms to supply valuable solutions to the problem of cancer subtype
classification, which has contributed to better estimation of a patient's
response to diverse treatments. However, the efficacy of these models is
seriously affected by the existing imbalance between the high dimensionality
of the gene expression feature sets and the number of samples available for a
particular cancer type. To counteract what is known as the curse of
dimensionality, feature selection and extraction methods have traditionally
been applied to reduce the number of input variables present in gene
expression datasets. Although these techniques work by scaling down the input
feature space, the prediction performance of traditional machine learning
pipelines using these feature reduction strategies remains moderate. In this
work, we propose the use of the Pan-Cancer dataset to pre-train deep
autoencoder architectures on a subset composed of thousands of gene expression
samples of very diverse tumor types. The resulting architectures are
subsequently fine-tuned on a collection of specific breast cancer samples.
This transfer-learning approach aims at combining supervised and unsupervised
deep learning models with traditional machine learning classification
algorithms to tackle the problem of breast tumor intrinsic-subtype
classification. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech